Game of Thrones (G.O.T) Network Analysis¶

Game of Thrones is a wildly popular television series by HBO, based on the (also) wildly popular book series "A Song of Ice and Fire" by George R.R. Martin. In this case study, we will analyze the co-occurrence network of the characters in the Game of Thrones books.¶

Procedures¶

Load all the raw datasets and perform descriptive analysis¶

Run Network Analysis Algorithms on individual books (and combined)¶

Calculate the different centrality measures and draw inferences from them¶

Create Network Graphs using Plotly¶

Run Louvain Community Detection and find out different groups/communities in the data¶

In [43]:
%matplotlib inline
import networkx as nx
from decorator import decorator
from networkx.utils import create_random_state, create_py_random_state
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
# Remove scientific notations and display numbers with 2 decimal points instead
pd.options.display.float_format = '{:,.2f}'.format        
# Update the default background style of the plots
sns.set_style(style='darkgrid')
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
import plotly.graph_objs as go
import plotly
import plotly.express as px
init_notebook_mode()
import warnings
warnings.filterwarnings('ignore')
import plotly.offline as pyo

Loading Data¶

In [2]:
os.listdir("raw_data_books/raw_data_books/")
Out[2]:
['book1.csv', 'book2.csv', 'book3.csv', 'book4.csv', 'book5.csv']
In [3]:
book1 = pd.read_csv("raw_data_books/raw_data_books/book1.csv")
In [4]:
book1.head()
Out[4]:
Person 1 Person 2 Type weight book
0 Addam-Marbrand Jaime-Lannister Undirected 3 1
1 Addam-Marbrand Tywin-Lannister Undirected 6 1
2 Aegon-I-Targaryen Daenerys-Targaryen Undirected 5 1
3 Aegon-I-Targaryen Eddard-Stark Undirected 4 1
4 Aemon-Targaryen-(Maester-Aemon) Alliser-Thorne Undirected 4 1

Combining Datasets Together¶

In [5]:
book2 = pd.read_csv("raw_data_books/raw_data_books/book2.csv")
book3 = pd.read_csv("raw_data_books/raw_data_books/book3.csv")
book4 = pd.read_csv("raw_data_books/raw_data_books/book4.csv")
book5 = pd.read_csv("raw_data_books/raw_data_books/book5.csv")
In [6]:
books = [book1, book2, book3, book4, book5]

# Stack the five edge lists, then sum the weights of pairs that occur in several books
books_combined = pd.concat(books, ignore_index = True)

books_combined = books_combined.groupby(["Person 2", "Person 1"], as_index = False)["weight"].sum()
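
One caveat with the `groupby` above: it treats ("A", "B") and ("B", "A") as different pairs, so a pair listed in opposite order in two books would not have its weights summed. Whether that happens in the real CSVs is not checked here. A minimal sketch of an order-independent aggregation, using made-up toy rows rather than the real data:

```python
import pandas as pd

# Toy stand-ins for two book edge lists (hypothetical rows, not the real data)
toy_book_a = pd.DataFrame({"Person 1": ["Arya-Stark"],
                           "Person 2": ["Jon-Snow"],
                           "weight": [4]})
toy_book_b = pd.DataFrame({"Person 1": ["Jon-Snow", "Jon-Snow"],
                           "Person 2": ["Arya-Stark", "Samwell-Tarly"],
                           "weight": [3, 5]})

toy = pd.concat([toy_book_a, toy_book_b], ignore_index=True)

# Sort each pair alphabetically so (A, B) and (B, A) land in the same group
sorted_pairs = [sorted(pair) for pair in zip(toy["Person 1"], toy["Person 2"])]
toy["Person 1"] = [pair[0] for pair in sorted_pairs]
toy["Person 2"] = [pair[1] for pair in sorted_pairs]

toy_combined = toy.groupby(["Person 1", "Person 2"], as_index=False)["weight"].sum()
print(toy_combined)
```

Here the Arya/Jon pair appears once in each order and ends up as a single row with the two weights added together.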

Description¶

In [7]:
books_combined.describe()
Out[7]:
weight
count 2,823.00
mean 11.56
std 19.98
min 3.00
25% 3.00
50% 5.00
75% 11.00
max 334.00

There are 2823 edges in total, i.e., 2823 distinct pairs of co-occurring characters.¶

The minimum weight is 3 (meaning every co-occurrence pair has been observed at least thrice), and the maximum weight is 334.¶

The mean weight is 11.56, so on average a co-occurring pair is mentioned together about 12 times. The median is only 5 and 75% of the weights are 11 or lower, so the distribution is right-skewed: a few very heavy pairs, such as the maximum of 334, pull the mean upward.¶

In [8]:
books_combined[books_combined["weight"] == 334]
Out[8]:
Person 2 Person 1 weight
1570 Robert-Baratheon Eddard-Stark 334

The maximum weight of 334 belongs to the pair Robert Baratheon and Eddard Stark, as shown above.¶

These two are pivotal characters in the first book.¶

Graphical Network¶

In [9]:
G1 = nx.from_pandas_edgelist(book1, 'Person 1', "Person 2", edge_attr = "weight", create_using = nx.Graph())
G2 = nx.from_pandas_edgelist(book2, 'Person 1', "Person 2", edge_attr = "weight", create_using = nx.Graph())
G3 = nx.from_pandas_edgelist(book3, 'Person 1', "Person 2", edge_attr = "weight", create_using = nx.Graph())
G4 = nx.from_pandas_edgelist(book4, 'Person 1', "Person 2", edge_attr = "weight", create_using = nx.Graph())
G5 = nx.from_pandas_edgelist(book5, 'Person 1', "Person 2", edge_attr = "weight", create_using = nx.Graph())
G = nx.from_pandas_edgelist(books_combined, 'Person 1', "Person 2", edge_attr = "weight", create_using = nx.Graph())

Number of Nodes & Edges¶

In [10]:
nx.info(G)
Out[10]:
'Graph with 796 nodes and 2823 edges'
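
On recent NetworkX releases (3.0 and later, where `nx.info` is no longer available) the same summary can be produced from the graph's own counters. A small sketch on a toy graph built from the first three rows of `book1` shown earlier, not the full dataset:

```python
import networkx as nx

# Toy graph with three edges taken from the head of book1
toy_G = nx.Graph()
toy_G.add_weighted_edges_from([
    ("Addam-Marbrand", "Jaime-Lannister", 3),
    ("Addam-Marbrand", "Tywin-Lannister", 6),
    ("Aegon-I-Targaryen", "Daenerys-Targaryen", 5),
])

summary = f"Graph with {toy_G.number_of_nodes()} nodes and {toy_G.number_of_edges()} edges"
print(summary)
```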

Creating functions to calculate the number of unique connections per character, Degree Centrality, Eigenvector Centrality, and Betweenness Centrality¶

In [11]:
def numUniqueConnec(G):
    # Degree of a node = number of distinct characters it co-occurs with
    numUniqueConnection = sorted(G.degree(), key = lambda x:x[1], reverse = True)
    numUniqueConnection = pd.DataFrame(numUniqueConnection, columns = ["Character", "NumberOfUniqueConnections"])
    
    return numUniqueConnection
In [12]:
numUniqueConnec(G)
Out[12]:
Character NumberOfUniqueConnections
0 Tyrion-Lannister 122
1 Jon-Snow 114
2 Jaime-Lannister 101
3 Cersei-Lannister 97
4 Stannis-Baratheon 89
... ... ...
791 Wynton-Stout 1
792 Bael-the-Bard 1
793 Yorko-Terys 1
794 Yurkhaz-zo-Yunzak 1
795 Zei 1

796 rows × 2 columns

Tyrion Lannister is the character with the highest number of unique connections, followed by Jon Snow and Jaime Lannister.¶

Degree Centrality¶

In [13]:
def deg_central(G):
    deg_centrality = nx.degree_centrality(G)
    deg_centrality_sort = sorted(deg_centrality.items(), key = lambda x:x[1], reverse = True)
    deg_centrality_sort = pd.DataFrame.from_dict(deg_centrality_sort)
    deg_centrality_sort.columns = (["Character", "Degree Centrality"])
    
    return deg_centrality_sort
In [14]:
deg_centrality_sort = deg_central(G)
deg_centrality_sort
Out[14]:
Character Degree Centrality
0 Tyrion-Lannister 0.15
1 Jon-Snow 0.14
2 Jaime-Lannister 0.13
3 Cersei-Lannister 0.12
4 Stannis-Baratheon 0.11
... ... ...
791 Wynton-Stout 0.00
792 Bael-the-Bard 0.00
793 Yorko-Terys 0.00
794 Yurkhaz-zo-Yunzak 0.00
795 Zei 0.00

796 rows × 2 columns

Tyrion Lannister is the character with the highest Degree Centrality, followed by Jon Snow and Jaime Lannister.¶

The higher the number of connections, the higher the Degree Centrality.¶
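
NetworkX computes Degree Centrality as a node's degree divided by the largest possible degree, n - 1. With 796 nodes, Tyrion's 122 connections give 122 / 795, which rounds to the 0.15 in the table above. A quick check of that arithmetic, plus the same formula on a toy star graph:

```python
import networkx as nx

# Values taken from the outputs above: 796 nodes, Tyrion has 122 connections
n_nodes = 796
tyrion_degree = 122
tyrion_centrality = tyrion_degree / (n_nodes - 1)
print(round(tyrion_centrality, 2))

# Same formula on a toy star graph: node 0 is the hub, nodes 1..4 are leaves
star = nx.star_graph(4)
manual = {node: deg / (star.number_of_nodes() - 1) for node, deg in star.degree()}
matches = manual == nx.degree_centrality(star)
print(matches)
```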

Eigenvector Centrality¶

In [15]:
def eigen_central(G):
    eigen_centrality = nx.eigenvector_centrality(G, weight = "weight")
    eigen_centrality_sort = sorted(eigen_centrality.items(), key = lambda x:x[1], reverse = True)
    eigen_centrality_sort = pd.DataFrame.from_dict(eigen_centrality_sort)
    eigen_centrality_sort.columns = (["Character", "EigenVector Centrality"])
    
    return eigen_centrality_sort
In [16]:
eigen_central(G)
Out[16]:
Character EigenVector Centrality
0 Tyrion-Lannister 0.38
1 Cersei-Lannister 0.36
2 Joffrey-Baratheon 0.34
3 Robert-Baratheon 0.28
4 Eddard-Stark 0.28
... ... ...
791 Simon-Toyne 0.00
792 Hugh-Hungerford 0.00
793 Murch 0.00
794 Torwold-Browntooth 0.00
795 Gormon-Tyrell 0.00

796 rows × 2 columns

Tyrion Lannister is also the leader when it comes to Eigenvector Centrality, followed by Cersei Lannister and Joffrey Baratheon.¶
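
Eigenvector Centrality scores a character highly when its neighbours are themselves well connected, so two characters with the same number of connections can score differently. The sketch below is a plain power iteration on a made-up toy graph (the node names are hypothetical, and this is an illustration of the idea rather than NetworkX's exact implementation):

```python
# Toy undirected graph: "Hub" has four neighbours; "P" and "Q" each have one
edges = [("Hub", "A"), ("Hub", "B"), ("Hub", "C"), ("Hub", "P"), ("C", "Q")]

neighbours = {}
for u, v in edges:
    neighbours.setdefault(u, set()).add(v)
    neighbours.setdefault(v, set()).add(u)

# Power iteration on (A + I): each score becomes itself plus its neighbours' scores
scores = {node: 1.0 for node in neighbours}
for _ in range(200):
    updated = {node: scores[node] + sum(scores[nb] for nb in neighbours[node])
               for node in neighbours}
    norm = sum(value ** 2 for value in updated.values()) ** 0.5
    scores = {node: value / norm for node, value in updated.items()}

for node, value in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{node}: {value:.2f}")
```

"P" and "Q" both have a single connection, but "P" is attached to the hub and therefore ends up with the higher score.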

Betweenness Centrality¶

In [17]:
def betweenness_central(G):
    betweenness_centrality = nx.betweenness_centrality(G, weight = "weight")
    betweenness_centrality_sort = sorted(betweenness_centrality.items(), key = lambda x:x[1], reverse = True)
    betweenness_centrality_sort = pd.DataFrame.from_dict(betweenness_centrality_sort)
    betweenness_centrality_sort.columns = (["Character", "Betweenness Centrality"])
    
    return betweenness_centrality_sort
In [18]:
betweenness_central(G)
Out[18]:
Character Betweenness Centrality
0 Jon-Snow 0.13
1 Theon-Greyjoy 0.12
2 Jaime-Lannister 0.12
3 Daenerys-Targaryen 0.09
4 Stannis-Baratheon 0.09
... ... ...
791 Yandry 0.00
792 Bael-the-Bard 0.00
793 Yorko-Terys 0.00
794 Yurkhaz-zo-Yunzak 0.00
795 Zei 0.00

796 rows × 2 columns

With Betweenness Centrality, it is Jon Snow who's at the top.¶

So, Jon Snow is the central character that seems to best connect different, disparate groupings of characters.¶
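
One caveat: `nx.betweenness_centrality` interprets `weight` as a distance, so passing raw co-occurrence counts makes frequently co-occurring pairs look far apart. A hedged sketch of one common alternative, using the inverse of the count as the distance, on a toy triangle (the names, counts, and the `distance` attribute are assumptions for illustration, not part of the original data):

```python
import networkx as nx

toy = nx.Graph()
toy.add_edge("A", "B", weight=10)   # strong tie
toy.add_edge("B", "C", weight=10)   # strong tie
toy.add_edge("A", "C", weight=1)    # weak tie

# Hypothetical 'distance' attribute: strong ties become short distances
for u, v, data in toy.edges(data=True):
    data["distance"] = 1 / data["weight"]

raw = nx.betweenness_centrality(toy, weight="weight")
inverted = nx.betweenness_centrality(toy, weight="distance")
print(raw["B"], inverted["B"])
```

With raw counts, the weak A-C tie is the "shortest" route and B bridges nothing; with inverted counts, the route through the two strong ties wins and B becomes the broker. Which reading is appropriate depends on the question being asked of the network.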

Graphs¶

In [44]:
def draw_plotly_network_graph(Graph_obj, filter = None, filter_nodesbydegree = None):
    G_dup = Graph_obj.copy()

    degrees = nx.classes.degree(G_dup)
    
    degree_df = pd.DataFrame(degrees)
    
    # Keep only the top `filter_nodesbydegree` characters by Degree Centrality in this graph
    if filter is not None:
        top = set(deg_central(Graph_obj)[:filter_nodesbydegree]["Character"].values)
        
        G_dup.remove_nodes_from([node
                             for node in G_dup.nodes
                             if node not in top
                            ]) 

    pos = nx.spring_layout(G_dup)

    for n, p in pos.items():
        G_dup.nodes[n]['pos'] = p

    # Create edges 
    # Add edges as disconnected lines in a single trace and nodes as a scatter trace
    edge_trace = go.Scatter(
        x = [],
        y = [],
        line = dict(width = 0.5, color = '#888'),
        hoverinfo = 'none',
        mode = 'lines')

    for edge in G_dup.edges():
        x0, y0 = G_dup.nodes[edge[0]]['pos']
        
        x1, y1 = G_dup.nodes[edge[1]]['pos']
        
        edge_trace['x'] += tuple([x0, x1, None])
        
        edge_trace['y'] += tuple([y0, y1, None])

    node_trace = go.Scatter(
        x = [],
        y = [],
        text = [],
        mode = 'markers',
        hoverinfo = 'text',
        marker = dict(
            showscale = True,
            colorscale = 'RdBu',
            reversescale = True,
            color = [],
            size = 15,
            colorbar = dict(
                thickness = 10,
                title = 'Node Connections',
                xanchor = 'left',
                titleside = 'right'
            ),
            line = dict(width = 0)))

    for node in G_dup.nodes():
        x, y = G_dup.nodes[node]['pos']
        
        node_trace['x'] += tuple([x])
        
        node_trace['y'] += tuple([y])

    # Color node points by the number of connections (degrees taken before filtering)
    degree_lookup = dict(zip(degree_df[0], degree_df[1]))

    for node in G_dup.nodes():
        n_connections = int(degree_lookup[node])
        
        node_trace['marker']['color'] += tuple([n_connections])
        
        node_info = node + '<br /># of connections: ' + str(n_connections)
        
        node_trace['text'] += tuple([node_info])

    # Create a network graph
    fig = go.Figure(data = [edge_trace, node_trace],
                 layout = go.Layout(
                    title = '<br>GOT network connections',
                    titlefont = dict(size = 20),
                    showlegend = False,
                    hovermode = 'closest',
                    margin = dict(b = 20, l = 5, r = 5, t = 0),
                    annotations=[ dict(
                        text = "",
                        showarrow = False,
                        xref = "paper", yref = "paper") ],
                    xaxis = dict(showgrid = False, zeroline = False, showticklabels = False),
                    yaxis = dict(showgrid = False, zeroline = False, showticklabels = False)))

    pyo.iplot(fig)
In [45]:
draw_plotly_network_graph(Graph_obj = G, filter = None, filter_nodesbydegree = None)
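
The edge trace in `draw_plotly_network_graph` draws every edge with a single line trace by separating segments with `None`, which Plotly treats as a break in the line. A minimal sketch of how those coordinate lists are built, using made-up positions instead of a spring layout:

```python
# Hypothetical node positions (a spring layout would normally supply these)
positions = {"Jon-Snow": (0.0, 0.0), "Samwell-Tarly": (1.0, 0.5), "Gilly": (1.5, -0.5)}
toy_edges = [("Jon-Snow", "Samwell-Tarly"), ("Samwell-Tarly", "Gilly")]

edge_x, edge_y = [], []
for source, target in toy_edges:
    x0, y0 = positions[source]
    x1, y1 = positions[target]
    # None ends the current segment so the next edge starts a fresh line
    edge_x.extend([x0, x1, None])
    edge_y.extend([y0, y1, None])

print(edge_x)
print(edge_y)
```

Building plain lists first and passing them to `go.Scatter(x = edge_x, y = edge_y, ...)` in one go avoids re-assigning the trace's coordinate tuples on every edge, which can matter on graphs with thousands of edges.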

All books combined (Top 50 Characters)¶

In [21]:
draw_plotly_network_graph(Graph_obj = G, filter = "Yes", filter_nodesbydegree = 50)

Tyrion Lannister is the most connected character across the book series, followed by Jon Snow and Jaime Lannister.¶

Book 1¶

In [22]:
draw_plotly_network_graph(Graph_obj = G1, filter = "Yes", filter_nodesbydegree = 50) 
#Top 50 characters network in Book 1

Eddard Stark is the most connected character, followed by Robert Baratheon.¶

Tyrion, Catelyn, and Jon are in the top 5 characters.¶

Robb, Sansa, and Bran are all well-connected too, but the first book mostly revolves around Eddard Stark and Robert Baratheon.¶

Cersei Lannister, Joffrey Baratheon, Jaime Lannister, Arya Stark, Daenerys, and Drogo are the other well-connected characters in this book.¶

In [23]:
deg_central(G1)[:20]
Out[23]:
Character Degree Centrality
0 Eddard-Stark 0.35
1 Robert-Baratheon 0.27
2 Tyrion-Lannister 0.25
3 Catelyn-Stark 0.23
4 Jon-Snow 0.20
5 Robb-Stark 0.19
6 Sansa-Stark 0.19
7 Bran-Stark 0.17
8 Cersei-Lannister 0.16
9 Joffrey-Baratheon 0.16
10 Jaime-Lannister 0.16
11 Arya-Stark 0.15
12 Petyr-Baelish 0.14
13 Tywin-Lannister 0.12
14 Daenerys-Targaryen 0.11
15 Jory-Cassel 0.11
16 Drogo 0.10
17 Rodrik-Cassel 0.10
18 Renly-Baratheon 0.10
19 Luwin 0.10

Book 2¶

In [24]:
draw_plotly_network_graph(Graph_obj = G2, filter = "Yes", filter_nodesbydegree = 50)

Tyrion Lannister has become the central character, followed by Joffrey Baratheon and Cersei Lannister.¶

Arya Stark has gained prominence, rising to fourth place by Degree Centrality, and is connected to Bran and Robb Stark.¶

Catelyn Stark has been pushed down from the top 5, but Robb Stark and Theon Greyjoy have gained importance.¶

Robert Baratheon and Eddard Stark have lost much of their importance because both died in the first book.¶

In [25]:
deg_central(G2)[:20]
Out[25]:
Character Degree Centrality
0 Tyrion-Lannister 0.21
1 Joffrey-Baratheon 0.18
2 Cersei-Lannister 0.17
3 Arya-Stark 0.16
4 Stannis-Baratheon 0.14
5 Robb-Stark 0.14
6 Catelyn-Stark 0.13
7 Theon-Greyjoy 0.12
8 Renly-Baratheon 0.12
9 Bran-Stark 0.12
10 Jon-Snow 0.11
11 Sansa-Stark 0.10
12 Robert-Baratheon 0.10
13 Eddard-Stark 0.09
14 Jaime-Lannister 0.08
15 Varys 0.08
16 Daenerys-Targaryen 0.07
17 Amory-Lorch 0.07
18 Sandor-Clegane 0.07
19 Tywin-Lannister 0.07

Book 3¶

In [26]:
draw_plotly_network_graph(Graph_obj = G3, filter = "Yes", filter_nodesbydegree = 50)

Tyrion Lannister remains the most central character, followed by Jon Snow & Joffrey Baratheon.¶

Jon Snow has risen multiple places and is one of the most connected characters in Book 3, second only to Tyrion Lannister.¶

Sansa Stark & Jaime Lannister have also gained prominence.¶

Robb Stark is also in the top 5 most connected characters.¶

In [27]:
deg_central(G3)[:20]
Out[27]:
Character Degree Centrality
0 Tyrion-Lannister 0.20
1 Jon-Snow 0.17
2 Joffrey-Baratheon 0.17
3 Robb-Stark 0.16
4 Sansa-Stark 0.16
5 Jaime-Lannister 0.15
6 Catelyn-Stark 0.13
7 Cersei-Lannister 0.13
8 Arya-Stark 0.12
9 Stannis-Baratheon 0.10
10 Samwell-Tarly 0.10
11 Tywin-Lannister 0.10
12 Robert-Baratheon 0.09
13 Daenerys-Targaryen 0.08
14 Mance-Rayder 0.07
15 Gregor-Clegane 0.07
16 Sandor-Clegane 0.07
17 Aemon-Targaryen-(Maester-Aemon) 0.06
18 Jeor-Mormont 0.06
19 Davos-Seaworth 0.06

Book 4¶

In [28]:
draw_plotly_network_graph(Graph_obj = G4, filter = "Yes", filter_nodesbydegree = 50)

An interesting insight here is that the most connected character in Book 4 is Jaime Lannister followed by Cersei.¶

Brienne and Tyrion Lannister follow, but far behind: their Degree Centrality is about 0.10, less than half of Jaime's 0.23 and Cersei's 0.22.¶

Arya Stark is no longer in the top 10.¶

In [29]:
deg_central(G4)[:20]
Out[29]:
Character Degree Centrality
0 Jaime-Lannister 0.23
1 Cersei-Lannister 0.22
2 Brienne-of-Tarth 0.10
3 Tyrion-Lannister 0.10
4 Margaery-Tyrell 0.09
5 Sansa-Stark 0.09
6 Tommen-Baratheon 0.09
7 Samwell-Tarly 0.07
8 Stannis-Baratheon 0.07
9 Petyr-Baelish 0.07
10 Victarion-Greyjoy 0.06
11 Arianne-Martell 0.06
12 Tywin-Lannister 0.06
13 Arya-Stark 0.06
14 Osmund-Kettleblack 0.05
15 Pycelle 0.05
16 Robert-Arryn 0.05
17 Aeron-Greyjoy 0.05
18 Qyburn 0.05
19 Robert-Baratheon 0.05
In [30]:
betweenness_central(G4)[:20]
Out[30]:
Character Betweenness Centrality
0 Stannis-Baratheon 0.24
1 Balon-Greyjoy 0.19
2 Jaime-Lannister 0.18
3 Baelor-Blacktyde 0.17
4 Cersei-Lannister 0.17
5 Tyrion-Lannister 0.17
6 Sansa-Stark 0.16
7 Arya-Stark 0.12
8 Samwell-Tarly 0.12
9 Tywin-Lannister 0.10
10 Myrcella-Baratheon 0.09
11 Sandor-Clegane 0.09
12 Brienne-of-Tarth 0.09
13 Doran-Martell 0.07
14 Victarion-Greyjoy 0.07
15 Catelyn-Stark 0.06
16 Aurane-Waters 0.06
17 Tommen-Baratheon 0.05
18 Randyll-Tarly 0.05
19 Leo-Tyrell 0.05

Book 5¶

In [31]:
draw_plotly_network_graph(Graph_obj = G5, filter = "Yes", filter_nodesbydegree = 50)

As expected, Jon Snow and Daenerys are the most connected characters in this book.¶

Stannis, Tyrion, and Theon Greyjoy follow them.¶

If you look closely, Stannis Baratheon (orange node in the middle) seems to be connecting multiple groups, i.e., he has high Betweenness Centrality.¶
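
A node can bridge groups without having many connections, which is why Betweenness and Degree Centrality can rank characters differently. A toy illustration on a barbell graph (two cliques linked through one connector node), unrelated to the actual book data:

```python
import networkx as nx

# Nodes 0-3 and 5-8 form two cliques; node 4 is the only link between them
toy = nx.barbell_graph(4, 1)

degree = dict(toy.degree())
betweenness = nx.betweenness_centrality(toy)

connector = 4
print(degree[connector], round(betweenness[connector], 2))
```

The connector has the fewest connections in the graph, yet every shortest path between the two cliques passes through it, so it has the highest Betweenness Centrality.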

In [32]:
deg_central(G5)[:20]
Out[32]:
Character Degree Centrality
0 Jon-Snow 0.20
1 Daenerys-Targaryen 0.18
2 Stannis-Baratheon 0.15
3 Tyrion-Lannister 0.10
4 Theon-Greyjoy 0.10
5 Cersei-Lannister 0.09
6 Barristan-Selmy 0.08
7 Hizdahr-zo-Loraq 0.07
8 Asha-Greyjoy 0.06
9 Melisandre 0.05
10 Jon-Connington 0.05
11 Quentyn-Martell 0.05
12 Mance-Rayder 0.05
13 Ramsay-Snow 0.05
14 Aegon-Targaryen-(son-of-Rhaegar) 0.05
15 Robert-Baratheon 0.05
16 Daario-Naharis 0.05
17 Doran-Martell 0.05
18 Selyse-Florent 0.05
19 Wyman-Manderly 0.04
In [33]:
betweenness_central(G5)[:20]
Out[33]:
Character Betweenness Centrality
0 Stannis-Baratheon 0.36
1 Daenerys-Targaryen 0.25
2 Jon-Snow 0.21
3 Robert-Baratheon 0.20
4 Asha-Greyjoy 0.17
5 Tyrion-Lannister 0.16
6 Cersei-Lannister 0.14
7 Godry-Farring 0.10
8 Tywin-Lannister 0.10
9 Barristan-Selmy 0.08
10 Eddard-Stark 0.08
11 Theon-Greyjoy 0.07
12 Doran-Martell 0.07
13 Axell-Florent 0.07
14 Wyman-Manderly 0.06
15 Bowen-Marsh 0.05
16 Aegon-Targaryen-(son-of-Rhaegar) 0.05
17 Mance-Rayder 0.05
18 Bran-Stark 0.05
19 Theomore 0.04

Character Evolutions in Books¶

In [34]:
# Creating a list of degree centrality of all the books
Books_Graph = [G1, G2, G3, G4, G5]

evol = [nx.degree_centrality(Graph) for Graph in Books_Graph]

# Creating a DataFrame from the list of degree centralities in all the books
degree_evol_df = pd.DataFrame.from_records(evol)

degree_evol_df.index = degree_evol_df.index + 1

# Plotting the degree centrality evolution of few important characters
fig = px.line(degree_evol_df[['Eddard-Stark', 'Tyrion-Lannister', 'Jon-Snow', 'Jaime-Lannister', 'Cersei-Lannister', 'Sansa-Stark', 'Arya-Stark']],
             title = "Evolution of Different Characters", width = 900, height = 600)

fig.update_layout(xaxis_title = 'Book Number',
                   yaxis_title = 'Degree Centrality Score',
                 legend = {'title_text': ''})

fig.show()

Eddard Stark was the most popular character in Book 1, but he was killed at the end of the book.¶

Overall, from all five books, Tyrion Lannister is the most popular character in the series.¶

There is a sudden increase in Jon Snow's popularity in Book 5.¶

Jaime and Cersei Lannister remain central characters throughout.¶

Sansa & Arya's importance is high in the first few books, but it decreases thereafter.¶
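
A character who does not appear in a given book has no entry in that book's centrality dictionary, so `pd.DataFrame.from_records` leaves a `NaN` there and the plotted line has a gap. If zero is the more useful reading, `fillna(0)` makes that explicit. A small sketch with made-up centrality values:

```python
import pandas as pd

# Hypothetical per-book degree centralities; "Eddard-Stark" is missing from book 2
toy_evol = [
    {"Eddard-Stark": 0.35, "Jon-Snow": 0.20},
    {"Jon-Snow": 0.11},
]

toy_df = pd.DataFrame.from_records(toy_evol)
toy_df.index = toy_df.index + 1          # label rows by book number

filled = toy_df.fillna(0)
print(filled)
```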

Community Detection¶

In [35]:
import community as community_louvain
import matplotlib.cm as cm
import colorlover as cl
In [36]:
partition = community_louvain.best_partition(G, random_state = 12345)
partition_df = pd.DataFrame([partition]).T.reset_index()
partition_df.columns = ["Character", "Community"]
partition_df
Out[36]:
Character Community
0 Aegon-V-Targaryen 0
1 Aemon-Targaryen-(Maester-Aemon) 0
2 Alleras 1
3 Alliser-Thorne 0
4 Andrey-Dalt 2
... ... ...
791 Yorko-Terys 7
792 Ysilla 8
793 Yurkhaz-zo-Yunzak 10
794 Zei 0
795 Zollo 5

796 rows × 2 columns

Community Distribution¶

In [37]:
partition_df["Community"].value_counts().sort_values(ascending = False)
Out[37]:
8     136
0     114
9     113
5     110
10     89
6      70
12     68
7      51
2      25
1      11
13      3
11      2
3       2
4       2
Name: Community, dtype: int64
In [38]:
colors = cl.scales['12']['qual']['Paired']

def scatter_nodes(G, pos, labels = None, color = 'rgb(152, 0, 0)', size = 8, opacity = 1):
    # pos is the dictionary of node positions
    # labels is a list  of labels of len(pos), to be displayed when hovering the mouse over the nodes
    # color is the color for nodes. When it is set as None, the Plotly's default color is used
    # size is the size of the dots representing the nodes
    # opacity is a value between 0 and 1, defining the node color opacity
    trace = go.Scatter(x = [], 
                    y = [],  
                    text = [],   
                    mode = 'markers', 
                    hoverinfo = 'text',
                           marker = dict(
            showscale = False,
            colorscale = 'RdBu',
            reversescale = True,
            color = [],
            size = 15,
            colorbar = dict(
                thickness = 10,
                xanchor = 'left',
                titleside = 'right'
            ),
            line = dict(width = 0)))
    
    for nd in G.nodes():
        x, y = G.nodes[nd]['pos']
        trace['x'] += tuple([x])
        trace['y'] += tuple([y])
        color = colors[partition[nd] % len(colors)]
        trace['marker']['color'] += tuple([color])
        
    for node, adjacencies in enumerate(G.adjacency()):
        node_info = adjacencies[0]
        trace['text'] += tuple([node_info])

    return trace    

def scatter_edges(G, pos, line_color = '#a3a3c2', line_width = 1, opacity = .2):
    trace = go.Scatter(x = [], 
                    y = [], 
                    mode = 'lines'
                   )
    
    for edge in G.edges():
        x0, y0 = G.nodes[edge[0]]['pos']
        x1, y1 = G.nodes[edge[1]]['pos']
        trace['x'] += tuple([x0, x1, None])
        trace['y'] += tuple([y0, y1, None])
        trace['hoverinfo'] = 'none'
        trace['line']['width'] = line_width
        
        if line_color is not None:              
            trace['line']['color'] = line_color
    
    return trace
In [39]:
def visualize_community(Graph, filter = "Yes", filter_nodes = 100):
    G_dup = Graph.copy()    # copy the graph passed in, not the global G
    degrees = nx.classes.degree(G_dup)
    degree_df = pd.DataFrame(degrees)
    
    if filter is not None:
        top = deg_centrality_sort[:filter_nodes]["Character"].values
        G_dup.remove_nodes_from([node
                             for node in G_dup.nodes
                             if node not in top
                            ])

    pos = nx.spring_layout(G_dup, seed = 1234567)

    for n, p in pos.items():
        G_dup.nodes[n]['pos'] = p

    trace1 = scatter_edges(G_dup, pos, line_width = 0.25)
    trace2 = scatter_nodes(G_dup, pos)
    
    fig = go.Figure(data = [trace1, trace2],
             layout = go.Layout(
                title = '<br> GOT Community Detection',
                titlefont = dict(size = 20),
                showlegend = False,
                hovermode = 'closest',
                margin = dict(b = 20, l = 5, r = 5, t = 40),
                annotations = [ dict(
                    text = "",
                    showarrow = False,
                    xref = "paper", yref = "paper") ],
                xaxis = dict(showgrid = False, zeroline = False, showticklabels = False),
                yaxis = dict(showgrid = False, zeroline = False, showticklabels = False)))
    
    iplot(fig)
In [40]:
visualize_community(Graph = G, filter = "Yes", filter_nodes = 100)

The Louvain method was able to find 14 different communities.¶
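
For readers without the `python-louvain` package, NetworkX (2.8 and later) ships its own Louvain implementation. The sketch below swaps in `louvain_communities` on a toy barbell graph, not the character network, and checks the quality of the partition with modularity; results on the real data may differ from those of `community_louvain.best_partition`:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Two 5-node cliques joined by a single edge: an obvious two-community graph
toy = nx.barbell_graph(5, 0)

communities = louvain_communities(toy, seed=12345)
quality = modularity(toy, communities)

print([sorted(c) for c in communities])
print(round(quality, 2))
```

Louvain greedily moves nodes between communities to increase modularity, so densely linked groups such as the two cliques here end up in separate communities.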

Here are descriptions of a few of the communities:¶

The yellow nodes represent the Dothraki and Daenerys's circle, including Drogo, Daenerys, Daario Naharis, etc.¶

The light purple nodes represent Tyrion, Cersei, Tywin, Joffrey, Sansa, etc.¶

The red nodes consist of Robb, Catelyn, Brienne, Jaime, etc.¶

Arya is grouped with Gendry and Beric Dondarrion in the orange nodes.¶

The light blue nodes represent another very important community, the Night's Watch, including Jon Snow, Jeor Mormont, Samwell Tarly, Gilly, Bowen Marsh, etc.¶

Similar inferences can be made for the other communities as well.¶
